Changing the Support of a Spatial Covariate: A Simulation Study
Authors
Tisha Hooks,* Jeffrey F. Pedersen, David B. Marx, and Roch E. Gaussoin
T. Hooks and D.B. Marx, Dep. of Statistics, Univ. of Nebraska–Lincoln, Lincoln, NE 68583-0963; J.F. Pedersen, USDA-ARS, NPA Grains, Bioenergy, and Forage Research, Univ. of Nebraska–Lincoln, Lincoln, NE 68583-0897; R.E. Gaussoin, Dep. of Agronomy and Horticulture, Univ. of Nebraska–Lincoln, Lincoln, NE 68583-0915. Received 25 July 2006. *Corresponding author ([email protected]).
Abstract
Researchers are increasingly able to capture spatially referenced data on both a response and a covariate more frequently and in more detail. A combination of geostatistical models and analysis of covariance methods may be used to analyze such data. However, very basic questions regarding the effects of using a covariate whose support differs from that of the response variable must be addressed to use these methods most efficiently. In this experiment, a simulation study was conducted to assess the following: (i) the gain in efficiency when geostatistical models are used, (ii) the gain in efficiency when analysis of covariance methods are used, and (iii) the effects of including a covariate whose support differs from that of the response variable in the analysis. This study suggests that analyses that both account for spatial structure and exploit information from a covariate are most powerful. The results also indicate that the support of the covariate should be as close as possible to the support of the response variable to obtain the most accurate experimental results.
Abbreviations: AICC, corrected Akaike information criterion.
Published in Crop Sci 47:622–628 (2007). doi: 10.2135/cropsci2006.07.0490. © Crop Science Society of America, 677 S. Segoe Rd., Madison, WI 53711 USA.

Recent advances in precision agriculture have provided researchers with the ability to collect various measurements such as infrared and visible light reflectance data (Servilla, 1998), which are indicative of such factors as moisture status during various stages of crop development (Bryant et al., 2003), and "on-the-go" data during harvest such as electrical conductivity readings (McGuire, 2003), yield, and test-weight readings (Wehrspann, 2000). Similar data are also available from satellite imagery (Frazier et al., 2004). These data points are typically associated with extremely dense spatial coordinates, thus creating the opportunity to use these measurements as covariates for the primary response variable to possibly increase experimental precision. As technologies for on-the-go data collection and the precision of imagery continue to improve, the importance and potential impact of using such data in planned experiments will increase.

In addition to the growing availability of massive amounts of spatially coordinated data, researchers have witnessed a rapid increase in the speed and power of computers, which allows them to manage such data effectively. Also, a collection of geostatistical models allows the researcher to both characterize and account for the underlying spatial patterns in the data, leading to potentially more precise estimation (Journel and Huijbregts, 1978; Isaaks and Srivastava, 1989; Cressie, 1993). A good introduction to geostatistical methods is given by Littell et al. (1996) in a chapter dealing with spatial variability. Ultimately, the parallel increases in computer technology and in the ability to collect (or access) spatial data have created a need for research on how to manage data and use resources efficiently.

As mentioned, these large data sets provide researchers with the opportunity to use some measurements as covariates, thereby improving estimation by using information about one variable that is contained in another. Analysis of covariance methods are used to analyze such data (Searle, 1971). However, basic questions need to be answered regarding the use of intensively collected data points as covariates in analyzing data collected over an entire plot (e.g., yield) or data collected at a single point in a plot to represent the entire plot (e.g., soil chemistry from a soil probe). For example, would greater experimental precision be obtained by using all intensively collected data points from a plot as covariates for a trait such as yield, or would some subset of the intensively collected data points provide greater experimental precision?

This question is related to what is widely known in geostatistics as the change of support problem (Olea, 1991; Cressie, 1993; Schabenberger and Gotway, 2005). The support of the data refers to the length, area, or volume that a measured datum represents. In many cases the data are collected at a single point and are thus said to have "point support." If all intensively collected data points are used in the analysis (e.g., by obtaining a block average of all data points included in a plot and using the block average as the new variable), then the support of the data has been changed. Effectively, this block average is a new variable, and the statistical and spatial properties of this new variable differ from those of the original. In particular, the spatial structure and parameters such as the range and sill of the corresponding semivariogram for this new variable are altered.
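The block-averaging step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `block_average`, the 25 × 25 grid of independent random values, and the 5 × 5 plot size are hypothetical choices, not the authors' code or data.

```python
import numpy as np

def block_average(points: np.ndarray, block: int) -> np.ndarray:
    """Average a grid of point-support values over block x block plots."""
    n_rows, n_cols = points.shape
    if n_rows % block or n_cols % block:
        raise ValueError("grid dimensions must be multiples of the block size")
    # Give each plot its own pair of axes, then average within each plot.
    shaped = points.reshape(n_rows // block, block, n_cols // block, block)
    return shaped.mean(axis=(1, 3))

# Hypothetical point-support covariate on a 25 x 25 grid (assumed data).
rng = np.random.default_rng(0)
points = rng.normal(size=(25, 25))

# Plot-support version: one block average per 5 x 5 plot, shape (5, 5).
plots = block_average(points, 5)
```

Comparing `points.var()` with `plots.var()` shows the averaged variable is less variable than the point values, which is one way the change of support alters the statistical properties of the covariate.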
For example, a variable of point support may be associated with a semivariogram such as the one shown in Fig. 1. This particular example illustrates a spherical model, one of the geostatistical models referenced earlier that allow us to characterize and account for underlying spatial variability. In general, the semivariogram is a measure of the average dissimilarity between data separated by a distance h. Because the function measures dissimilarity, its value increases with lag distance h. The parameters of the semivariogram are the range, sill, and nugget. For the spherical semivariogram, the range is defined as the critical distance above which observations become independent and beyond which the model function returns a constant value, the sill. The sill is equal to the variance of independent observations. Finally, the nugget describes microscale variation that may cause a discontinuity at the origin. Changing the support of the data by averaging over observations will change these parameter values and may change the results of the analysis considerably (Clark, 1979).

In addition to the conventional change of support problem, questions also arise regarding the effects of conducting an analysis of covariance when the response variable and the covariate are of different supports. To answer such questions, one would ideally know the true values of the treatment and response variables along with their spatial structure and conduct numerous replicates of each experiment to place confidence in the results. Simulation studies provide such capacity.
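To make the roles of the range, sill, and nugget concrete, here is a minimal sketch of a spherical semivariogram. The parameterization is an assumption made for illustration: `sill` is taken to be the total plateau value (nugget included), and the function and argument names are hypothetical.

```python
import numpy as np

def spherical_semivariogram(h, range_, sill, nugget=0.0):
    """Spherical semivariogram; `sill` is the total plateau (an assumption)."""
    h = np.asarray(h, dtype=float)
    # Beyond the range the ratio is capped at 1, so the curve stays at the sill.
    ratio = np.minimum(h / range_, 1.0)
    gamma = nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio**3)
    # The semivariogram is exactly 0 at lag 0; the nugget is a jump just after.
    return np.where(h > 0, gamma, 0.0)

# Example with a range of 25, a sill of 5, and no nugget.
lags = np.array([0.0, 12.5, 25.0, 40.0])
gamma = spherical_semivariogram(lags, range_=25.0, sill=5.0)
# gamma is [0.0, 3.4375, 5.0, 5.0]: it rises with lag and is flat past the range.
```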
Therefore, the objectives of this research were to conduct a simulation study to explore (i) the gain in efficiency when methods that exploit spatial structure are used, (ii) the gain in efficiency when methods that exploit information from a covariate are used, and (iii) the effects of including a covariate whose support differs from that of the response variable in the analysis.

MATERIALS AND METHODS

The simulated experiment consisted of five replications of five treatments. The treatments were laid out in a completely randomized design on a 5 × 5 arrangement of plots and were randomly assigned to the plots in each iteration. Within each plot, another 5 × 5 grid of points was constructed (Fig. 2).

Figure 1. Example of a spherical semivariogram.
Figure 2. Layout for data generation.

For each of the 625 points, both a spatial floor (Y) and a spatial covariate (X) were generated using the method of Gaussian cosimulation (Oliver, 2003). This method is described as follows: let

$$Y = \mu_1 + L_1 Z_1$$

and

$$X = \mu_2 + L_2\left(\rho Z_1 + \sqrt{1 - \rho^2}\, Z_2\right)$$

where $L_1$ and $L_2$ are the square roots (which can be obtained via methods such as the Cholesky decomposition and spectral decomposition) of given covariance matrices, $\mu_1$ and $\mu_2$ are the means of Y and X, and $Z_1$ and $Z_2$ are vectors of independent normally distributed random variables with mean 0 and variance 1. Then, it is easily shown that the covariance of Y is $\mathrm{cov}(Y) = L_1 L_1'$, the covariance of X is $\mathrm{cov}(X) = L_2 L_2'$, and the cross-covariance of Y and X can be written as $\mathrm{cov}(Y, X) = \rho L_1 L_2'$ (Oliver, 2003). Note that the parameter $\rho$ ($-1 \le \rho \le 1$) determines the strength of the relationship that exists between the spatial floor and the covariate.
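The cosimulation recipe above can be sketched as follows. This is a hedged illustration, not the authors' program: the unit grid spacing, zero means, the helper names (`grid_coords`, `spherical_cov`, `matrix_sqrt`, `cosimulate`), and the default ranges and sill are assumptions chosen for the example, and the matrix square root is taken by spectral (eigen) decomposition, one of the two options the text mentions.

```python
import numpy as np

def grid_coords(n):
    """Coordinates of an n x n grid with unit spacing (assumed spacing)."""
    ii, jj = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return np.stack([ii, jj], axis=-1).reshape(-1, 2).astype(float)

def spherical_cov(h, range_, sill):
    """Spherical covariance with zero nugget; equals `sill` at h = 0."""
    ratio = np.minimum(h / range_, 1.0)
    return sill * (1.0 - 1.5 * ratio + 0.5 * ratio**3)

def matrix_sqrt(cov):
    """Return L with L @ L.T == cov, via spectral decomposition."""
    w, v = np.linalg.eigh(cov)
    # Clip tiny negative eigenvalues that arise from rounding error.
    return v * np.sqrt(np.clip(w, 0.0, None))

def cosimulate(coords, rho, mu_y=0.0, mu_x=0.0,
               range_y=25.0, range_x=15.0, sill=5.0, rng=None):
    """Draw one realization of (Y, X) with cross-covariance rho * L1 @ L2.T."""
    rng = np.random.default_rng() if rng is None else rng
    h = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    L1 = matrix_sqrt(spherical_cov(h, range_y, sill))
    L2 = matrix_sqrt(spherical_cov(h, range_x, sill))
    z1 = rng.standard_normal(len(coords))
    z2 = rng.standard_normal(len(coords))
    y = mu_y + L1 @ z1
    # Sharing z1 between Y and X is what induces the cross-covariance.
    x = mu_x + L2 @ (rho * z1 + np.sqrt(1.0 - rho**2) * z2)
    return y, x

# One realization on a 25 x 25 grid of 625 points with a strong relationship.
coords = grid_coords(25)
y, x = cosimulate(coords, rho=0.7, rng=np.random.default_rng(42))
```

Setting `rho=0` makes the two fields independent, while `rho=1` with identical covariance matrices makes X coincide with Y, which is a quick sanity check on the construction.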
In this simulation, the spherical covariance function with a nugget of zero was used for the construction of both variables. The function is as follows:

$$C(h) = \begin{cases} \sigma^2\left[1 - \dfrac{3}{2}\left(\dfrac{h}{a}\right) + \dfrac{1}{2}\left(\dfrac{h}{a}\right)^3\right] & \text{if } 0 \le h \le a \\ 0 & \text{if } h > a \end{cases}$$

where $h$ is the distance between observations, $a$ is the range of the corresponding spherical semivariogram, and $\sigma^2$ is the sill of the semivariogram. Y was simulated with a range of 25 and a sill of 5. X, the covariate, was simulated with a range of 15 and a sill of 5. Finally, two correlation values were considered when modeling the cross-covariance between the spatial floor and the covariate so that both a weak and a strong relationship between the two variables could be examined: $\rho = 0.3$ and $\rho = 0.7$. Treatment effects were generated with the following treatment vectors τ:
Similar resources
Spatial Regression in the Presence of Misaligned data
In this paper, four approaches are presented to the problem of fitting a linear regression model in the presence of spatially misaligned data. These approaches are the plug-in method, simulation, regression calibration, and maximum likelihood. In the first two approaches, by modeling the correlation of the explanatory variable, the prediction of the explanatory variable is determined at sites...
Assessment of spatial variability of cation exchange capacity with kriging and cokriging
Cation exchange capacity (CEC) is one of the most important soil attributes which control some basic properties of soil such as acidity, water and nutrient retaining capacity. However, the measurement of cation exchange capacity in large areas is time consuming and requires high expenses. One way to save time and expenses is to use simple soil covariates and geostatistical methods in mapping CE...
The Effect of Simulation Chemistry Training on Spatial Ability and Problem Solving Skills for Tenth Year Female Students in Tehran
Aims: One of the contexts that can play an important role in teaching and learning is educational simulation. Given the benefits of this technology, the present study investigates the impact of chemistry education using educational simulation on the spatial ability and problem solving skills of tenth grade female students in Tehran. Methods: This is a quasi-experimental study with pretest-postt...
Simulation and Evaluation of Urban Development Scenarios Using Integration of Cellular Automata Model and Game Theory
Urban growth is a dynamic and evolutionary spatial and social process that relates to the changes of urban spatial units and the transformation of people’s lifestyles and consequently demographic changes. Considering the urban development process as a function of land uses interactions, population structure and the strategic behavior of the agents involved in the urban development process (the ...
Analysis of spatial variability of soil properties using geostatistics and remote sensing
Soil mapping is one of the basic studies in the natural resource sector. The purpose of this study was to analyze the spatial variability of soil properties on the map of arid areas and deserts. For this purpose, a region with an area of 600 hectares in Qom, in the Salt Lake watershed, was considered. The methods used include inverse distance methods, radial functions, and prediction local general estimate. Krig...
Effects of Digital Elevation Models (DEM) Spatial Resolution on Hydrological Simulation
The digital elevation model is one of the most important data sources for watershed modeling with hydrological models, and it has a significant impact on the simulation of hydrological processes. Several studies using the Soil and Water Assessment Tool (SWAT) have indicated that the simulation results of this model are very sensitive to the quality of topographic data. The aim of this study is evaluati...
Publication date: 2016